Introduction
The following is intended as a set of tips for people learning how to use Git and GitHub.
Session plan
- Introduction
- Tips
- Short practical on making a pull request
- Elsie - the OpenSAFELY codelist system
Session aims
By the end of the session you should
- have a basic understanding of how Git works
- be able to perform common Git operations using GitHub Desktop and the GitHub web interface, including
- clone a repo from GitHub
- make a new branch
- make commits
- push your branch to GitHub
- make a pull request
Guides to Git and GitHub
There are many excellent guides to Git and GitHub online, e.g.,
- Intro to GitHub here
- GitHub Training & Guides YouTube channel here
- Git documentation and training here
- Hadley Wickham on Git here
- Jenny Bryan on Git and GitHub with R here
And most relevantly the OpenSAFELY documentation here.
These tips are meant to supplement them.
Tips
Intro to Git
- Git was written to allow developers to work on the source code of the Linux kernel (text files)
- One kernel release they got in a terrible mess
- This provoked Linus Torvalds into action
- For an excellent insight into his thinking watch this talk he gave at Google here
- Git can be intimidating to use, especially at the command line, and its error messages (like those from LaTeX and R) can be quite cryptic

- A Git repository is a folder/directory on your computer which has been Git initialised
- Git is commonly referred to as version control software
- Git is better described as a content-addressable filesystem, which means that Git tracks the contents of the files in your repo
Git creates a little database of the contents of your files - snapshots (commits) are taken when you tell it to
Git only sees changes once they are saved to disk, so a file with unsaved edits shows no changes in Git until you save it

Git takes a snapshot of your files (a commit) only when you tell it to. Here I saved my file from above, entered a commit message and clicked “Commit to master”
Each commit is identified by a 40-character SHA-1 checksum, calculated from the contents of your files at that time together with the commit’s metadata (author, date, message and parent commit)
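As a rough illustration of what “content addressable” means, here is a small Python sketch that reproduces the identifier Git gives to the contents of a single file (Git calls this a blob). It assumes Git's default SHA-1 object format, and the function name git_blob_hash is made up for this example. A commit's own identifier is calculated in a similar way, but over the commit information rather than over one file.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the identifier Git would give to a file with these contents."""
    # Git hashes a short header ("blob", the size in bytes, a zero byte)
    # followed by the raw file contents. Assumes the SHA-1 object format.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


first = git_blob_hash(b"hello world\n")
same_again = git_blob_hash(b"hello world\n")  # identical contents
edited = git_blob_hash(b"hello world!\n")     # one character added

print(first)                # a 40-character hexadecimal string
print(first == same_again)  # True: same contents, same identifier
print(first == edited)      # False: any change gives a new identifier
```

The file name plays no part in the calculation, which is why reopening and closing a file without editing it leaves nothing for Git to record.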



Git knows the state of your files at every commit
- You can easily restore your files to a previous state
For Git the state of your files only changes when their contents change
- If you reopen a file, make no changes, then close it, Git will show no changes in your repo
- If you add an empty folder/directory to your repo Git will show no changes in your repo
- This differs from OneDrive/SharePoint/Google Drive, which are file synchronisation systems
I recommend not placing your Git repos in a location that is synced by OneDrive or Google Drive
The .git folder
- When you initialise a directory the .git folder is created
- This contains all of the files Git uses to track the contents of your files
- Here is the .git folder of a repo on my computer (I have selected to View hidden files in Windows Explorer)

- Confusingly GitHub hides the .git folder from view

- Here are its contents - don’t edit these manually

- Explanation of these is (from here)

Common Git commands
- I recommend you use GitHub Desktop instead of these commands
- These commands are what GitHub Desktop is using behind the scenes
- Git is the name of the program; git is the name of the executable available at your command line
git init
git add <filename>
git status
git commit -m "Your commit message"
git commit --amend -m "Your amended commit message"
git push
git pull
git clone
git branch
git checkout
git merge
git fetch
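To show how the first few of these commands fit together, here is a minimal Python sketch that runs them in a throwaway folder. It is only an illustration: the helper names run_git and demo_basic_commands, the file notes.md and the placeholder name and email are all assumptions made for this example, and it is skipped if the git executable cannot be found.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return what it printed."""
    completed = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return completed.stdout.strip()


def demo_basic_commands() -> dict:
    """Create a throwaway repo, make one commit and report what Git sees."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        run_git(repo, "init")                               # git init
        (repo / "notes.md").write_text("# My notes\n")
        # git status, in its short machine-readable form
        status_before = run_git(repo, "status", "--porcelain")
        run_git(repo, "add", "notes.md")                    # git add <filename>
        run_git(
            repo,
            "-c", "user.name=Example User",                 # placeholder identity
            "-c", "user.email=example@example.com",
            "-c", "commit.gpgsign=false",
            "commit", "-m", "Add notes file",               # git commit -m "..."
        )
        return {
            "status_before": status_before,
            "status_after": run_git(repo, "status", "--porcelain"),
            "commit_hash": run_git(repo, "rev-parse", "HEAD"),
        }


if shutil.which("git") is None:
    print("git is not installed, so the demo was skipped")
else:
    print(demo_basic_commands())
```

Before the commit, the status lists notes.md as an untracked file; after the commit the status is empty because the snapshot matches the files on disk.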
Installing Git and GitHub Desktop
Installing Git
- Windows
- Download and install from here
- macOS comes with an outdated version of Git
I recommend installing the Homebrew version
First install Homebrew, see instructions here
Then run in your Terminal app
brew upgrade
brew install git
Additionally on a Mac it is helpful to install the Xcode command line tools (this avoids installing the whole of Xcode)
xcode-select --install
- You must reinstall these every time you upgrade the operating system version, e.g., from Big Sur to Monterey
- Once Git is installed its executable (called git) should be available at your command line
Check which version you have with (you want something recent-ish)
git --version
On my Windows machine I have
git version 2.33.1.windows.1
Installing GitHub Desktop
- You could use Git through its command line syntax; however, I recommend you use a graphical Git program
- For Windows and macOS download and install GitHub Desktop from here
- A Linux version of GitHub Desktop is available from here
- I recommend installing the free VSCode text editor, from here, and setting that as the “External editor” in GitHub Desktop options (Click: File | Options…)

- On Windows I also recommend installing Windows Terminal from here
Intro to GitHub
GitHub is a web service for hosting Git repositories; there are others, e.g., GitLab
Your repositories will be stored on GitHub, and you will clone them to your machine to work on them (or work on them in Gitpod)
Under your user account you see the repos you are the owner of
On GitHub OpenSAFELY is an organisation
- The repos are owned by the organisation, so they show up under the organisation here

GitHub PAT for R
- A GitHub Personal Access Token (PAT) allows you more downloads from GitHub per hour; to create one, run in R
install.packages("usethis")
library(usethis)
create_github_token()
GitHub CLI
- GitHub CLI is a command line interface (CLI) for operating GitHub
- Installation instructions are here
- But I don’t recommend using this
Git and GitHub Workflow
Standard GitHub workflow
- (I recommend only forking a public repo if you intend to send a pull request to it)
- Fork the other person’s repo (from your fork this will be known as the upstream repo; your copy of a repo on GitHub is known as origin)
- This creates a copy of their repo under your account (your fork)
- Clone your fork (the copy under your account) to your machine
- Create a new branch (do not work on master/main)
- Make your changes and commit them
- Push your new branch up to your GitHub (i.e., to your fork)
- Create a pull request (from your new branch) back to the default (master/main) branch of the original repo
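For the curious, the steps above correspond roughly to the git commands listed by the Python sketch below. It only prints the commands; it does not run them. The function name fork_workflow_commands, the URLs, the branch name and the commit message are hypothetical placeholders.

```python
import shlex


def fork_workflow_commands(fork_url: str, upstream_url: str, branch: str) -> list:
    """List the git commands behind the fork and pull request steps."""
    return [
        # Cloning your fork makes it the remote called "origin"
        ["git", "clone", fork_url],
        # Register the repo you forked from as the remote called "upstream"
        ["git", "remote", "add", "upstream", upstream_url],
        # Create a new branch and switch to it (do not work on master/main)
        ["git", "checkout", "-b", branch],
        # Stage and commit your changes
        ["git", "add", "."],
        ["git", "commit", "-m", "Describe your change"],
        # Push the new branch up to your fork
        ["git", "push", "--set-upstream", "origin", branch],
    ]


for command in fork_workflow_commands(
    "https://github.com/your-username/their-repo.git",  # hypothetical fork URL
    "https://github.com/other-person/their-repo.git",   # hypothetical upstream URL
    "my-new-branch",
):
    print(shlex.join(command))
```

Every command after git clone is run from inside the cloned folder, and the pull request itself is then opened on the GitHub website.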
Workflow with an OpenSAFELY GitHub repo
- Skip the forking step from the standard GitHub workflow
- The repo on GitHub is known as origin
- Clone the repo to your local machine
- Click: Code | Open with GitHub Desktop

- Click Clone in the box which appears in GitHub Desktop

- Then follow the steps under Making a pull request
Making a pull request
- Let’s start by creating a new branch

- We do some work (in VSCode/text editor/RStudio) which creates a markdown file with a title and some text. We then make a new commit which adds this new file to the repo

- Next publish the new branch to GitHub

- Now initiate the creation of the PR by either clicking in GitHub Desktop “Create Pull Request”

- or clicking on the button on the repo webpage “Compare & pull request”

- Edit the title box, add some extra text in the comment box, select a reviewer, and then click “Create pull request”

- You can amend/edit pull requests by modifying/adding commits to the branch from which you sent the PR
- See more about pull request reviews here
- The reviewer will then merge your PR

- The reviewer will then confirm the merge

- (Optional) Delete the branch the PR came from

- The PR is now finished and we can see the merge commit in the default (main/master) branch
- In GitHub Desktop click “Fetch origin”/“Pull origin” to pull the updated main/master branch down to your local machine … and the process begins again …
OpenSAFELY repositories
- OpenSAFELY is a system of Python packages (opensafely and cohortextractor) which run various Docker containers
- The main GitHub organisation page is here
- All the core code is published in their opensafely-core organisation on GitHub here
- And there is also their opensafely-actions organisation here
- A Docker container is like a virtual machine
- It defines the operating system and programs running within it
- On my Windows 10 machine I can run an Ubuntu Docker container
- Just because an R package is installed in the R installation on your machine does not mean that it is installed in the OpenSAFELY R Docker container
- See the list of packages in the R Docker container here
Demo repo
- Have a look at the demo repo here
Getting started
- See the OpenSAFELY (OS) page here
- If creating a new repo create from the OS template here

- This is already Git initialized
- Important files
project.yaml
- Defines the jobs and the order in which they run
/analysis/study_definition.py
- Defines the study population extracted from the OpenSAFELY database
- This should return .csv file/s of data to read into R
/analysis/##_R-scripts.R
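To give a feel for how project.yaml can define “the order in which they run”, here is a small Python sketch of dependency ordering. It assumes that each job states which other jobs it needs to have finished first; the job names are hypothetical, and this is only an illustration of the idea, not how OpenSAFELY itself schedules jobs.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; each one maps to the jobs it needs to have run first.
needs = {
    "generate_study_population": [],
    "clean_data": ["generate_study_population"],
    "fit_model": ["clean_data"],
    "make_plots": ["clean_data"],
}

# static_order() yields every job after all of the jobs it depends on.
run_order = list(TopologicalSorter(needs).static_order())
print(run_order)
```

Jobs that do not depend on each other, such as fit_model and make_plots here, may come out in either order.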
Running jobs (on the dummy data)
- In your OS repo online
- On your own machine - install the following free software
- (If on Windows - Windows Subsystem for Linux version 2)
- Docker Desktop
- Python
- Git
- GitHub Desktop
- VSCode text editor
Additional topics
Avoid making commits with lots of changes
- Do not commit changes to many files with a single commit message such as “Edits”!

- Note that in a commit we can see the added lines (green highlight with a + prefix) and the deleted lines (red highlight with a - prefix)

Writing good commit messages
- Follow the standard recommendations about making commit messages, see
Files for Git to ignore
- You should not commit all files in the folder on your computer into your repo
- The .gitignore file is a list of files and folders in your repo for Git to ignore
- Common files to ignore are
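To illustrate how ignore patterns behave, the Python sketch below checks a few file names against some example patterns. The patterns and file names are assumptions chosen for illustration (typical R and operating system clutter), and fnmatchcase only approximates Git's real matching rules, which also handle folders, negation and more.

```python
from fnmatch import fnmatchcase

# Example patterns as they might appear, one per line, in a .gitignore file.
# These are illustrative assumptions, not a recommended or complete list.
ignore_patterns = [".Rhistory", ".RData", ".DS_Store", "*.log"]


def is_ignored(filename: str) -> bool:
    """Simplified check: does the file name match any ignore pattern?"""
    return any(fnmatchcase(filename, pattern) for pattern in ignore_patterns)


for name in ["analysis.R", ".Rhistory", "model.log", "README.md"]:
    print(name, "ignored" if is_ignored(name) else "tracked")
```

A pattern such as *.log uses a wildcard to cover every file with that ending, so you do not have to list each file separately.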
GitHub repos contain more than just code
- A repo for an R package will probably contain
- The code for the R package
- The code for its website (often made with pkgdown and hosted with GitHub Pages or Netlify)

- Scripts for controlling continuous integration services such as GitHub Actions
Short practical
- On GitHub:
- Go to our test repo (in our test organization) here

- Clone the repo to your local machine
- In GitHub Desktop: make a new branch and switch to it
- In any text editor:
- Create a new markdown file called yourfirstname-yourlastname.md
- Add a sentence or two to the file about yourself, e.g.,

- Save this file into the (top level of the) repo
- In GitHub Desktop: Commit this new file into your new branch
- In GitHub Desktop: Push your new branch up to GitHub
- On GitHub: Open a pull request from your branch to the main branch in which you select a reviewer (Tom/Venexia/Elsie)
- In your text editor and GitHub Desktop: Make any changes requested by the reviewer and add these to your PR - hopefully your pull request will then be merged by the reviewer!
- On GitHub: Delete the branch you made your pull request from
- In GitHub Desktop: Pull down the updated main branch to your machine … in a real workflow you would then make another new branch and do more work…